Spelling errors are introduced into text either during typing or when the user does not know the correct phoneme or grapheme. If a language contains complex words such as sandhi, where two or more morphemes join according to a set of rules, spell checking becomes very tedious. In such situations, a spell checker with a sandhi splitter, which alerts the user by flagging errors and providing suggestions, is very useful. A novel algorithm for sandhi splitting is proposed in this paper. The sandhi splitter can split about 7000 of the most common sandhi words in the Kannada language, which were used as test samples. The sandhi splitter was integrated with a Kannada spell checker, and a mechanism for generating suggestions was added. A comprehensive, platform-independent, standalone spell checker with sandhi splitter application was thus developed and tested extensively for efficiency and correctness. A comparative analysis of this spell checker with sandhi splitter was carried out, and the results showed that the Kannada spell checker with sandhi splitter has improved performance: it is twice as fast and 200 times more space efficient, and it is 90% accurate for complex nouns and 50% accurate for complex verbs. Such a spell checker with sandhi splitter will be of foremost significance in machine translation systems, voice processing, etc. This is the first sandhi splitter for Kannada, and the advantage of the novel algorithm is that it can be extended to all Indian languages.